A Convolutional Neural Network model based on Neutrosophy for Noisy Speech Recognition
Convolutional neural networks are sensitive to unknown noise conditions in the test phase, so their performance degrades on noisy data classification tasks, including noisy speech recognition. In this research, a new convolutional neural network (CNN) model with data uncertainty handling, referred to as NCNN (Neutrosophic Convolutional Neural Network), is proposed for classification. Here, speech signals are used as input data and their noise is modeled as uncertainty. A definition of uncertainty in the neutrosophic (NS) domain is proposed based on the speech spectrogram: uncertainty is computed for each time-frequency point of the spectrogram, treated like a pixel, so that an uncertainty matrix of the same size as the spectrogram is created in the NS domain. In the next step, a CNN classification model with two parallel paths is proposed: the speech spectrogram is the input to the first path and the uncertainty matrix to the second. The outputs of the two paths are combined to compute the final output of the classifier. To show its effectiveness, the proposed method has been compared with a conventional CNN on the isolated words of the Aurora2 dataset. The proposed method achieves an average accuracy of 85.96% on noisy training data. It is more robust against Car, Airport, and Subway noises, with accuracies of 90%, 88%, and 81% on test sets A, B, and C, respectively. Results show that the proposed method outperforms the conventional CNN by 6, 5, and 2 percentage points on test sets A, B, and C, respectively, meaning that it is more robust against noisy data and handles such data effectively. Comment: International Conference on Pattern Recognition and Image Analysis (IPRIA 2019)
Optimized discriminative transformations for speech features based on minimum classification error
Feature extraction is an important step in pattern classification and speech recognition. Extracted features should discriminate classes from each other while being robust to environmental conditions such as noise. For this purpose, transformations are applied to the features. In this paper, we propose a framework to improve independent feature transformations such as PCA (Principal Component Analysis) and HLDA (Heteroscedastic Linear Discriminant Analysis) using the minimum classification error criterion. In this method, we modify full transformation matrices such that the classification error is minimized for the mapped features; the feature vector dimension is not reduced by this mapping. The proposed methods are evaluated for continuous phoneme recognition on clean and noisy TIMIT. Experimental results show that they improve the performance of the PCA and HLDA transformations for MFCC features in both clean and noisy conditions.
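The idea above — start from a full (non-reducing) transformation matrix and adjust it to minimize a smoothed classification error — can be sketched as follows. This is only an illustration, not the paper's implementation: the full PCA matrix serves as initialization, a nearest-class-mean discriminant stands in for the paper's phoneme recognizer, and a sigmoid-smoothed error with a numerical gradient replaces whatever optimizer the authors used.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 2-class data (rows = 4-dimensional feature vectors)
X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(2, 1, (50, 4))])
y = np.array([0] * 50 + [1] * 50)

# Full PCA matrix (no dimension reduction), used as initialization
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
W = Vt.T.copy()  # 4x4 full transformation matrix

def mce_loss(W):
    """Sigmoid-smoothed classification error of the mapped features,
    using a nearest-class-mean discriminant as a stand-in classifier."""
    Z = X @ W
    mu = np.array([Z[y == c].mean(axis=0) for c in (0, 1)])
    d_own = np.linalg.norm(Z - mu[y], axis=1)
    d_rival = np.linalg.norm(Z - mu[1 - y], axis=1)
    g = d_own - d_rival          # > 0 means misclassified
    return float(np.mean(1.0 / (1.0 + np.exp(-g))))

def numeric_grad(f, W, eps=1e-5):
    """Central-difference gradient (illustrative optimizer choice)."""
    G = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        Wp, Wm = W.copy(), W.copy()
        Wp[idx] += eps
        Wm[idx] -= eps
        G[idx] = (f(Wp) - f(Wm)) / (2 * eps)
    return G

before = mce_loss(W)
for _ in range(10):
    W -= 0.1 * numeric_grad(mce_loss, W)  # modify the full matrix
after = mce_loss(W)
```

Note that `W` stays square throughout, matching the abstract's point that the mapping does not reduce the feature dimension.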
Discriminative transformation for speech features based on genetic algorithm and HMM likelihoods
Hidden Markov Models (HMMs) are a well-known classification approach whose parameters are conventionally learned using the maximum likelihood (ML) criterion via the expectation-maximization algorithm. Parameter learning has been improved beyond ML using the concept of discrimination among classes, in contrast to maximizing the likelihood of each individual class. In this paper, we propose a discriminative feature transformation method based on a genetic algorithm that increases HMM likelihoods during training for better class discrimination. The method is evaluated for phoneme recognition on clean and noisy TIMIT. Experimental results demonstrate that the proposed transformation method achieves a higher phone recognition rate than well-known feature transformation methods and the conventional HMM learning algorithm based on the ML criterion.
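Searching for a feature transformation with a genetic algorithm can be sketched as below. This is a minimal illustration under stated assumptions: a between/within scatter ratio of the transformed features stands in for the paper's HMM training likelihoods as the fitness, and the GA operators (truncation selection, one-point crossover, Gaussian mutation) are generic choices, not necessarily those of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy 2-class data (rows = 3-dimensional feature vectors)
X = np.vstack([rng.normal(0, 1, (40, 3)), rng.normal(1.5, 1, (40, 3))])
y = np.array([0] * 40 + [1] * 40)

def fitness(w):
    """Stand-in fitness: between-class over within-class scatter of the
    transformed features (the paper uses HMM likelihoods instead)."""
    Z = X @ w.reshape(3, 3)
    mus = [Z[y == c].mean(axis=0) for c in (0, 1)]
    within = sum(((Z[y == c] - mus[c]) ** 2).sum() for c in (0, 1))
    between = ((mus[0] - mus[1]) ** 2).sum() * len(y)
    return between / within

pop = rng.normal(0, 1, (30, 9))  # population of flattened 3x3 transforms
init_best = max(fitness(w) for w in pop)

for gen in range(40):
    scores = np.array([fitness(w) for w in pop])
    parents = pop[np.argsort(scores)[-10:]]      # truncation selection
    children = []
    while len(children) < 20:
        a, b = parents[rng.integers(10, size=2)]
        cut = rng.integers(1, 9)                 # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        child += rng.normal(0, 0.05, 9)          # Gaussian mutation
        children.append(child)
    # elitism: parents survive, so the best fitness never decreases
    pop = np.vstack([parents, np.array(children)])

best = pop[np.argmax([fitness(w) for w in pop])]
```

Because the elite parents are carried into each new population unchanged, the best fitness is monotonically non-decreasing over generations, which mirrors the paper's goal of steadily increasing the discriminative objective during training.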